Streamer#

The streamer module provides functionality for streaming the output of beam search during text generation. It includes the BeamStreamer class, which is designed to work with the LMCSC (Language Model-based Corrector with Semantic Constraints) system.

Key Components#

  • BeamStreamer: A class that extends the BaseStreamer from the Transformers library to handle beam search output streaming.

BeamStreamer#

The BeamStreamer class is a specialized streamer for handling beam search output. It processes the beam search results and provides a streaming interface for accessing the generated text.

Key Features:#

  • Supports streaming of beam search results

  • Handles decoding of tokens into text

  • Provides an iterator interface for easy access to generated text

  • Supports timeout for streaming operations

  • Handles end-of-stream signaling
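The features above follow a common producer/consumer pattern: the generation loop `put`s text into a queue, `end` enqueues a stop sentinel, and the iterator drains the queue. A minimal, illustrative sketch of that pattern (names here are hypothetical, not the lmcsc implementation):

```python
from queue import Queue

class MiniStreamer:
    """Illustrative queue-based streamer; mirrors the put/end/iterate pattern."""

    STOP = object()  # sentinel marking end of stream

    def __init__(self, timeout=None):
        self.text_queue = Queue()
        self.timeout = timeout  # None means queue operations block indefinitely

    def put(self, text):
        # Producer side: the generation loop pushes each text chunk here.
        self.text_queue.put(text, timeout=self.timeout)

    def end(self):
        # Signal end of stream by enqueueing the sentinel.
        self.text_queue.put(self.STOP, timeout=self.timeout)

    def __iter__(self):
        return self

    def __next__(self):
        value = self.text_queue.get(timeout=self.timeout)
        if value is self.STOP:
            raise StopIteration
        return value

streamer = MiniStreamer()
for chunk in ["Hel", "Hello", "Hello world"]:
    streamer.put(chunk)
streamer.end()
print(list(streamer))  # ['Hel', 'Hello', 'Hello world']
```

In practice the producer (generation) and consumer (iteration) run in different threads, which is why the queue and timeout exist at all.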

API Documentation#

class lmcsc.streamer.BeamStreamer(tokenizer: AutoTokenizer, timeout: float | None = None, **decode_kwargs)[source]#

Bases: BaseStreamer

A streamer class that handles beam search output streaming during text generation.

This class extends BaseStreamer to provide functionality for streaming beam search results, processing the beam hypotheses, and providing an iterator interface for accessing the generated text.

Notes

This class only supports batch size 1.

Parameters:
  • tokenizer (AutoTokenizer) – The tokenizer used to decode the tokens into text.

  • timeout (float, optional, defaults to None) – The timeout in seconds for queue operations. If None, queue operations block indefinitely.

  • **decode_kwargs – Additional keyword arguments passed to the tokenizer’s decode method.

tokenizer#

The tokenizer instance used for decoding.

Type:

AutoTokenizer

decode_kwargs#

Additional arguments for token decoding.

Type:

dict

print_len#

Length of the text already emitted, used to compute the newly generated suffix.

Type:

int

text_queue#

Queue for storing generated text chunks.

Type:

Queue

stop_signal#

Signal used to indicate end of stream.

timeout#

Timeout value for queue operations.

Type:

float

last_text#

Most recently generated text.

Type:

str
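The `print_len` attribute enables delta streaming: each finalized beam hypothesis is the full text so far, and only the not-yet-emitted suffix needs to be surfaced. A hypothetical sketch of that cursor logic (it assumes each hypothesis extends the previous one; in real beam search the best hypothesis can change between steps, which is what `last_text` helps track):

```python
def incremental_chunks(hypotheses):
    """Yield only the not-yet-emitted suffix of each full hypothesis."""
    print_len = 0  # cursor: how much text has already been emitted
    for text in hypotheses:
        yield text[print_len:]
        print_len = len(text)

# Each hypothesis extends the previous one; only the new suffix is yielded.
print(list(incremental_chunks(["He", "Hello", "Hello world"])))
# ['He', 'llo', ' world']
```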

Examples

>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> from lmcsc.streamer import BeamStreamer
>>>
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> streamer = BeamStreamer(tokenizer)
>>>
>>> # Stream generated text. Generation must run concurrently (e.g. in a
>>> # separate thread); iterating the streamer blocks until text arrives.
>>> for text in streamer:
...     print(text)
put(value: BeamSearchScorer)[source]#

Receives tokens, decodes them, and puts the decoded text into the queue.

Parameters:

value (tuple) – A tuple of (BeamSearchScorer, token_ids). The BeamSearchScorer holds the current beam hypotheses; token_ids is a list of token IDs that are decoded into text.

Raises:

ValueError – If the batch size is greater than 1.

end()[source]#

Signals the end of the stream by putting the stop signal in the queue.

on_finalized_text(text: str, stream_end: bool = False)[source]#

Puts finalized text into the queue and handles stream end signaling.

Parameters:
  • text (str) – The text to put in the queue.

  • stream_end (bool, optional) – Whether this is the end of the stream. Defaults to False.